<html>
<head><meta charset="utf-8"><title>Writing a filesystem crate without regrets · general · Zulip Chat Archive</title></head>
<h2>Stream: <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/index.html">general</a></h2>
<h3>Topic: <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html">Writing a filesystem crate without regrets</a></h3>

<hr>

<base href="https://rust-lang.zulipchat.com">

<head><link href="https://rust-lang.github.io/zulip_archive/style.css" rel="stylesheet"></head>

<a name="233173744"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233173744" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Colin Finck <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233173744">(Apr 05 2021 at 13:47)</a>:</h4>
<p>Hey all! This is roughly the current design of the filesystem crate I'm working on:</p>
<div class="codehilite"><pre><span></span><code>pub struct FileSystem&lt;T: io::Read + io::Seek&gt; {
    partition: T,
}

pub struct Files&lt;&#39;a, T: io::Read + io::Seek&gt; {
    fs: &amp;&#39;a mut FileSystem&lt;T&gt;,
    current_offset: u64,
    last_offset: u64,
}

impl&lt;&#39;a, T&gt; Iterator for Files&lt;&#39;a, T: io::Read + io::Seek&gt; {
    type Item = io::Result&lt;File&lt;&#39;a, T&gt;&gt;;

    fn next(&amp;mut self) -&gt; Option&lt;Self::Item&gt; {
        ...
    }
}

pub struct File&lt;&#39;a, T: io::Read + io::Seek&gt; {
    fs: &amp;&#39;a mut FileSystem&lt;T&gt;,
    name: [u8; 255],
}

impl&lt;&#39;a, T&gt; io::Read for File&lt;&#39;a, T: io::Read + io::Seek&gt; {
    fn read(&amp;mut self, buf: &amp;mut [u8]) -&gt; Result&lt;usize&gt; {
        ...
    }
}

impl&lt;&#39;a, T&gt; io::Seek for File&lt;&#39;a, T: io::Read + io::Seek&gt; {
    fn seek(&amp;mut self, pos: SeekFrom) -&gt; Result&lt;u64, Self::Error&gt; {
        ...
    }
}
</code></pre></div>
<p>Now Rust's borrow checker and the design of <code>std::io::Read</code> are giving me a hard time to actually implement my plan.<br>
<code>fn read</code> needs a <code>&amp;mut self</code>, so I need to carry mutable references to <code>FileSystem</code> in <code>Files</code> and <code>File</code> just to be able to read.<br>
However, when doing that, I cannot return a <code>File</code> from <code>Files::next</code> due to the lifetime limitations of <code>Iterator</code>.</p>
<p>I have a few ideas around this, but none that really convinces me. It feels a lot like "pick your poison":</p>
<ol>
<li>
<p><code>FileSystem::partition</code> becomes a <code>RefCell&lt;T&gt;</code> (like the fatfs crate does)</p>
<p>- Advantages: All references to <code>FileSystem</code> can be constant. <code>Iterator</code>, <code>io::Read</code>/<code>io::Seek</code>, and even <code>io::Write</code> could be implemented without any hassles.<br>
  - Disadvantages: Borrow-checking happens at runtime. I lose fundamental guarantees of the Rust compiler, and contributors need to be extra-cautious to not cause any panics.</p>
</li>
<li>
<p>Using the positioned-io crate (like the ext4 crate does)</p>
<p>- Advantages: Reading is possible with a constant reference to <code>FileSystem</code>.<br>
  - Disadvantages: This is no solution for later adding write support, and incompatible to <code>std::io</code>.</p>
</li>
<li>
<p>Not implementing <code>Iterator</code>, but just <code>fn next</code> (like the streaming-iterator crate does)</p>
<p>- Advantages: <code>io::Read</code>, <code>io::Seek</code>, and <code>io::Write</code> can be implemented.<br>
  - Disadvantages: I lose all advantages of <code>Iterator</code>, like for loops or combining my iterator with others (<code>map</code>, <code>zip</code>, etc.).<br>
    By holding mutable references, I'm also subject to very rigid lifetime restrictions. Two simultaneous <code>File</code> instances aren't possible, and this is a likely case.</p>
</li>
<li>
<p>Only store small frequently accessed data (like the name) inside <code>File</code>, pass a temporary mutable reference to <code>FileSystem</code> every time we access the file content</p>
<p>- Advantages: I can implement <code>Iterator</code> and have two simultaneous <code>File</code> instances.<br>
  - Disadvantages: I cannot implement <code>io::Read</code>/<code>io::Seek</code>/<code>io::Write</code> on <code>File</code>, as their function signatures don't take a <code>&amp;mut FileSystem</code>.<br>
    A caller may also pass me a different <code>&amp;mut FileSystem</code> than I expect.</p>
</li>
</ol>
<p>Any advice, other ideas, or good examples that help me choose here?</p>



<a name="233174162"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233174162" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> bjorn3 <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233174162">(Apr 05 2021 at 13:51)</a>:</h4>
<p>Read is implemented for &amp;std::fs::File. It is possible that Seek is also implemented for &amp;std::fs::File.</p>



<a name="233174213"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233174213" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> bjorn3 <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233174213">(Apr 05 2021 at 13:52)</a>:</h4>
<p>Just make sure that your types don't implement Sync to prevent race conditions between seeking and reading.</p>



<a name="233174596"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233174596" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Mario Carneiro <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233174596">(Apr 05 2021 at 13:55)</a>:</h4>
<p>Do you consider it a logic bug to have the same file open twice?</p>



<a name="233174792"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233174792" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Mario Carneiro <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233174792">(Apr 05 2021 at 13:57)</a>:</h4>
<p>Is it possible for the filesystem implementation to handle concurrent modification of distinct files? The same file?</p>



<a name="233175384"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233175384" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Mario Carneiro <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233175384">(Apr 05 2021 at 14:02)</a>:</h4>
<p>My guess is that you need something like (1), except that locking the entire filesystem on any file mutation is too strict, so you will have to split it into chunks and lock them individually.</p>



<a name="233215773"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233215773" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Colin Finck <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233215773">(Apr 05 2021 at 19:11)</a>:</h4>
<p><span class="user-mention" data-user-id="133247">@bjorn3</span> Nice find! Not sure I can apply it to my case though. If my <code>FileSystem::partition</code> field takes something that implements <code>std::io::Read</code>, I can still only call <code>filesystem.partition.read</code> on a <code>&amp;mut FileSystem</code>, not a <code>&amp;FileSystem</code>.</p>



<a name="233219375"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233219375" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Colin Finck <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233219375">(Apr 05 2021 at 19:42)</a>:</h4>
<p><span class="user-mention" data-user-id="271719">@Mario Carneiro</span> These questions need to be asked indeed. Ultimately, someone should be able to write a perfect filesystem crate supporting all these things of course :)<br>
What I know for sure already: I cannot lock the partition on a byte slice basis, as every data range on a filesystem can refer to any other data range.</p>
<p>I had hoped to hear about other good examples. Basically every crate that parses a binary format should face similar problems.<br>
Am I really the first one trying to write such a universal Rust filesystem crate? <span aria-label="thinking" class="emoji emoji-1f914" role="img" title="thinking">:thinking:</span></p>



<a name="233219463"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233219463" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Jake Goulding <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233219463">(Apr 05 2021 at 19:42)</a>:</h4>
<p>"universal" is often a Bad Word in my experience. Why would you need such a thing?</p>



<a name="233222821"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233222821" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Colin Finck <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233222821">(Apr 05 2021 at 20:10)</a>:</h4>
<p>Let's say I want to write an operating system and a bootloader for that operating system. Both need to access files on a filesystem - the OS with full read/write and concurrency support while the bootloader only needs basic read features.<br>
I don't want to duplicate work and share as much filesystem code as possible in a crate. Projects in C usually have two distinct implementations, but I'm sure Rust can do better than that</p>



<a name="233225658"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233225658" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Mario Carneiro <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233225658">(Apr 05 2021 at 20:36)</a>:</h4>
<p><span class="user-mention" data-user-id="350190">@Colin Finck</span>  I think you should think about how the actual underlying data races are resolved to determine what the borrowing structure is. You say that if you want the backing store to be an arbitrary implementation of <code>io::Read</code> then you need mutable access. Why is this? Because the object might be loading and caching read bytes. So now suppose you have multiple files open and they each want to read some data. They are going to have to query the backing store, which is going to need to get mutated to retrieve the data. That can't happen concurrently, so you either need to assume more about the backing store (for example, it's just a byte array, or a <code>fs::File</code>, so the read implementation can just be a cursor over an immutable backing store), or the two files need to synchronize the mutation, meaning a <code>Mutex</code> or <code>RefCell</code>. This isn't "losing the fundamental guarantees of the Rust compiler", it is expressing the required borrowing patterns using the type system.</p>



<a name="233225694"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233225694" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Jake Goulding <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233225694">(Apr 05 2021 at 20:36)</a>:</h4>
<p>Ah, so universal doesn't mean "an abstraction over HFS and ZFS and EXT3/4 and JFS and NFS and ..."</p>



<a name="233226368"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233226368" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Jake Goulding <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233226368">(Apr 05 2021 at 20:42)</a>:</h4>
<p>You want to actually <em>implement</em> EXT3/4 or FAT or whatever. You "just" want both a small code version and a full-featured version of it.</p>



<a name="233247029"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233247029" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> The 8472 <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233247029">(Apr 05 2021 at 23:52)</a>:</h4>
<p>You could also use something like <a href="https://man7.org/linux/man-pages/man2/pwrite.2.html">pread</a>, as represented by <a href="https://doc.rust-lang.org/std/os/unix/fs/trait.FileExt.html">os::unix::fs::FileExt</a>  instead of Read + Seek. That avoids the need to keep state between seeking and reading.</p>



<a name="233343783"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233343783" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Colin Finck <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233343783">(Apr 06 2021 at 15:49)</a>:</h4>
<p><span class="user-mention" data-user-id="271719">@Mario Carneiro</span> Thanks for the reminder about a cached <code>io::Read</code> implementor. I had only thought about the inner cursor, which already explained the need for <code>&amp;mut</code>.<br>
Still, I look to avoid <code>RefCell</code>, as the compiler happily accepts two subsequent <code>borrow_mut</code> calls, only screwing me at runtime, whereas it outright denies two regular mutable borrows. Guess I will end up there anyway though after thinking through all locking boundaries again.</p>



<a name="233343868"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233343868" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Colin Finck <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233343868">(Apr 06 2021 at 15:49)</a>:</h4>
<p><span class="user-mention" data-user-id="116155">@Jake Goulding</span> Exactly! In particular, this is becoming a crate around the NTFS filesystem.</p>



<a name="233344039"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233344039" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Colin Finck <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233344039">(Apr 06 2021 at 15:50)</a>:</h4>
<p><span class="user-mention" data-user-id="330154">@The 8472</span> Yep, just like my proposal (2) employing the positioned-io crate. Only helps for reading, not writing though.</p>



<a name="233365933"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233365933" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Mario Carneiro <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233365933">(Apr 06 2021 at 18:25)</a>:</h4>
<p><span class="user-mention" data-user-id="350190">@Colin Finck</span> I think you should use a <code>Mutex</code> rather than a <code>RefCell</code>, since in the case of concurrent modifications to the device, I guess you want to block instead of crash the program or othewise return an error. (Remember that <code>RefCell</code> also has APIs that allow it to report an error instead of panicking, and you can turn that error into an <code>io::Error</code>, which any downstream consumer should already be prepared to handle,  so it's not completely hopeless.)</p>



<a name="233444803"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233444803" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Colin Finck <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233444803">(Apr 07 2021 at 07:30)</a>:</h4>
<p><span class="user-mention" data-user-id="271719">@Mario Carneiro</span> Thinking more about this, I'm leaning towards my proposal (4) now: Let none of my structures own the partition bytes (or just a mutable reference to it) and temporarily pass it to all functions that require it.<br>
This gives the caller the ability to perform locking as necessary (a full-fledged kernel needs more granular locking than a single-threaded bootloader) and avoids making choices deep in my crate now that I will regret later. For example, a <code>Mutex</code> isn't no-std aware, which is very important for writing a bootloader.<br>
(Granted, <code>std::io</code> isn't no-std aware either, but there are at least 5 no-std implementations of the same traits)</p>
<p><span class="user-mention" data-user-id="307680">@Kitlith</span> also hinted me to the petgraph crate, which follows a similar pattern through its <code>detach</code> function that converts a borrowing <code>Neighbors</code> struct into a non-borrowing <code>WalkNeighbors</code> struct: <a href="https://docs.rs/petgraph/0.5.1/petgraph/graph/struct.Neighbors.html#method.detach">https://docs.rs/petgraph/0.5.1/petgraph/graph/struct.Neighbors.html#method.detach</a></p>



<a name="233460389"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/122651-general/topic/Writing%20a%20filesystem%20crate%20without%20regrets/near/233460389" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Mario Carneiro <a href="https://rust-lang.github.io/zulip_archive/stream/122651-general/topic/Writing.20a.20filesystem.20crate.20without.20regrets.html#233460389">(Apr 07 2021 at 09:55)</a>:</h4>
<blockquote>
<p>For example, a <code>Mutex</code> isn't no-std aware, which is very important for writing a bootloader.</p>
</blockquote>
<p>Well, here again, there is the question of "why" that leads to a design question about your own library: Why is <code>Mutex</code> not no_std? Because it needs a mechanism to handle blocking and waking threads, that may be OS dependent. So the question becomes, what do you actually want to happen in this case? If you want a non-blocking solution, the <code>RefCell</code> version does that. If you want to parameterize over the behavior, you can supply the handler via a trait implementation.</p>
<p>One danger of the <code>detach</code> approach is that you lose some type safety in the detached objects; for example you could use a <code>WalkNeighbors</code> from one graph and index into another. This not only leads to some API unsafety, it also means that you need validity checks whenever you use the object (in the example that would be <code>WalkNeighbors::next_edge</code>). I think it is better for the <code>FileReader</code> structure to be borrowed, like <code>std::vec::Iter&lt;'a, T&gt;</code>, but rather than borrowing the entire filesystem it only borrows what it needs to in order to support reading, using <code>RefCell</code> or an equivalent at the relevant level of granularity in the guts of the filesystem.</p>
<p>(Note that what is actually borrowed need not be some chunk of memory, it can just be a token that means "I'm reading this file" with the actual synchronization mechanism being implemented with raw pointers and unsafe, assuming you can implement a safe abstraction for it with the property that disjoint files can be read concurrently without synchronization.)</p>



<hr><p>Last updated: Aug 07 2021 at 22:04 UTC</p>
</html>